Hankowsky Homework Solutions

Basic Problems

13.16

Toxic Effects of Copper and Zinc. Reconsider the data in Display 10.20 (file: ex1014). (a) Is the experiment a randomized block design or a completely randomized design with factorial treatment structure? (b) Analyze the data using two-way analysis of variance. (c) How does the analysis of variance compare to the regression analysis (i.e., what are the issues in deciding which analysis is more appropriate)?

#Read-In the data
minnow <- Sleuth3::ex1014

#take a look at the dataset 
head(minnow)
##   Copper Zinc Protein
## 1      0    0     201
## 2      0  375     186
## 3      0  750     173
## 4      0 1125     110
## 5      0 1500     115
## 6     38    0     202
#turn copper and zinc into factors 
minnow <- minnow %>%
  mutate(Copper = as.factor(Copper), 
         Zinc = as.factor(Zinc))


#look at the boxplots 
p1 <- minnow %>%
  ggplot() + 
  geom_boxplot(aes(x = as.factor(Copper), y = Protein)) + 
  labs(x = "Copper") + 
  theme_classic()

p2 <- minnow %>%
  ggplot() + 
  geom_boxplot(aes(x = as.factor(Zinc), y = Protein)) + 
  labs(x = "Zinc", y = "") + 
  theme_classic()

grid.arrange(p1, p2, nrow = 1)

#look at the interaction plot
with(minnow, interaction.plot(Copper, Zinc, Protein))

# REGRESSION ANALYSIS
#running the full model 
minnow_lm <- lm(Protein ~ Copper + Zinc, data = minnow) #when I included the interaction term it wasn't really running the subsequent code well 

par(mfrow=c(2,2))
plot(minnow_lm)

summary(minnow_lm)
## 
## Call:
## lm(formula = Protein ~ Copper + Zinc, data = minnow)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##  -38.6   -8.4    0.8   10.4   31.2 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   190.40      11.66  16.336 2.11e-11 ***
## Copper38        4.20      12.29   0.342 0.736906    
## Copper75       -0.40      12.29  -0.033 0.974430    
## Copper113      -9.00      12.29  -0.733 0.474426    
## Copper150     -18.80      12.29  -1.530 0.145491    
## Zinc375       -23.80      12.29  -1.937 0.070581 .  
## Zinc750       -18.80      12.29  -1.530 0.145491    
## Zinc1125      -57.40      12.29  -4.672 0.000255 ***
## Zinc1500      -67.00      12.29  -5.453 5.31e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 19.43 on 16 degrees of freedom
## Multiple R-squared:  0.7415, Adjusted R-squared:  0.6122 
## F-statistic: 5.736 on 8 and 16 DF,  p-value: 0.00151
anova(minnow_lm)
## Analysis of Variance Table
## 
## Response: Protein
##           Df  Sum Sq Mean Sq F value    Pr(>F)    
## Copper     4  1685.2   421.3  1.1165 0.3831489    
## Zinc       4 15629.2  3907.3 10.3546 0.0002461 ***
## Residuals 16  6037.6   377.3                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# ANOVA ANALYSIS 
minnow_aov <- aov(Protein ~ Copper + Zinc, data = minnow)

par(mfrow=c(2,2))
plot(minnow_aov)

summary(minnow_aov)
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## Copper       4   1685     421   1.116 0.383149    
## Zinc         4  15629    3907  10.355 0.000246 ***
## Residuals   16   6038     377                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The experiment is a randomized block design, because the beakers containing minnows were randomly allocated to one of the treatments, each of which constitutes a single trial. The results from the ANOVA table for analysis of variance and the regression analysis are the same. It does not matter which method is used to analyse factorial/blocked designed studies, the results will be the same. However, if the study is interested in subgroup comparisons, those may be easier in a regression format with linear combinations.




13.18

El Ni ˜no and Hurricanes. Reconsider the El Ni˜no and Hurricane data set from Exercise 10.28. (a) Regress the log of the storm index on West African wetness (treated as a categorical factor with 2 levels) and El Ni˜no temperature (treated as a categorical factor with 3 levels); retain the sum of squared residuals and the residual degrees of freedom. (b) Regress the log of the storm index onWest African wetness (treated as categorical with 2 levels), El Ni˜no temperature (treated as numerical), and the square of El Ni˜no temperature. Retain the sum of squared residuals and the residual degrees of freedom. (c) Explain why the answers to (a) and (b) are the same. (d) Explain why a test that the coefficient of the temperature-squared term is zero can be used to help decide whether to treat temperature as numerical or categorical.

#Read-In the data
hurricane <- Sleuth3::ex1028

#take a look at the dataset 
head(hurricane)
##   Year  ElNino Temperature WestAfrica Storms Hurricanes StormIndex
## 1 1950    cold          -1          1     13         11        243
## 2 1951    warm           1          0     10          8        121
## 3 1952 neutral           0          1      7          6         97
## 4 1953    warm           1          1     14          6        121
## 5 1954    cold          -1          1     11          8        127
## 6 1955    cold          -1          1     12          9        198
#transform the data as described in the question 
hurricane <- hurricane %>%
  mutate(log_stormindex = log(StormIndex), 
         Temperature_a = as.factor(Temperature), 
         WestAfrica_cat = as.factor(WestAfrica))



# REGRESSION ANALYSIS - PART A
#run the regression 
hurricane_a_lm1 <- lm(log_stormindex ~ WestAfrica_cat + Temperature_a + WestAfrica_cat:Temperature_a, data = hurricane)

par(mfrow=c(2,2))
plot(hurricane_a_lm1)

#refit without interaction and test for interaction effect
hurricane_a_lm2 <- update(hurricane_a_lm1, ~ . - WestAfrica_cat:Temperature_a)
anova(hurricane_a_lm2, hurricane_a_lm1) # no evidence for an interaction effect, p-value = 0.56
## Analysis of Variance Table
## 
## Model 1: log_stormindex ~ WestAfrica_cat + Temperature_a
## Model 2: log_stormindex ~ WestAfrica_cat + Temperature_a + WestAfrica_cat:Temperature_a
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1     44 6.8881                           
## 2     42 6.7060  2    0.1821 0.5702 0.5697
#run optimal model, question asks for both West Africa wetness and El Nino temp to be in the model 
hurricane_a_lm <- lm(log_stormindex ~ WestAfrica_cat + Temperature_a, data = hurricane)

summary(hurricane_a_lm)
## 
## Call:
## lm(formula = log_stormindex ~ WestAfrica_cat + Temperature_a, 
##     data = hurricane)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.88582 -0.28625  0.02172  0.26209  0.85711 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       4.5320     0.1270  35.676  < 2e-16 ***
## WestAfrica_cat1   0.4941     0.1275   3.874 0.000352 ***
## Temperature_a0   -0.1497     0.1455  -1.029 0.309063    
## Temperature_a1   -0.5933     0.1506  -3.940 0.000288 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3957 on 44 degrees of freedom
## Multiple R-squared:  0.5278, Adjusted R-squared:  0.4956 
## F-statistic:  16.4 on 3 and 44 DF,  p-value: 2.695e-07
anova(hurricane_a_lm)
## Analysis of Variance Table
## 
## Response: log_stormindex
##                Df Sum Sq Mean Sq F value    Pr(>F)    
## WestAfrica_cat  1 4.9843  4.9843 31.8389 1.128e-06 ***
## Temperature_a   2 2.7158  1.3579  8.6741 0.0006673 ***
## Residuals      44 6.8881  0.1565                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
par(mfrow=c(2,2))
plot(hurricane_a_lm)

# REGRESSION ANALYSIS - PART B 
hurricane_b_lm1 <- lm(log_stormindex ~ WestAfrica_cat + Temperature + WestAfrica_cat:Temperature, data = hurricane)

par(mfrow=c(2,2))
plot(hurricane_b_lm1)

#refit without interaction and test for interaction effect
hurricane_b_lm2 <- update(hurricane_b_lm1, ~ . - WestAfrica_cat:Temperature)
anova(hurricane_b_lm2, hurricane_b_lm1) # no evidence for an interaction effect, p-value = 0.60
## Analysis of Variance Table
## 
## Model 1: log_stormindex ~ WestAfrica_cat + Temperature
## Model 2: log_stormindex ~ WestAfrica_cat + Temperature + WestAfrica_cat:Temperature
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1     45 7.1163                           
## 2     44 7.0724  1   0.04398 0.2736 0.6035
#run optimal model, question asks for both West Africa wetness and El Nino temp to be in the model 
hurricane_b_lm <- lm(log_stormindex ~ WestAfrica_cat + Temperature, data = hurricane)

summary(hurricane_b_lm)
## 
## Call:
## lm(formula = log_stormindex ~ WestAfrica_cat + Temperature, data = hurricane)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.79350 -0.26273 -0.00785  0.29715  0.80576 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      4.29001    0.07472  57.416  < 2e-16 ***
## WestAfrica_cat1  0.47897    0.12756   3.755 0.000496 ***
## Temperature     -0.29998    0.07563  -3.966 0.000259 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3977 on 45 degrees of freedom
## Multiple R-squared:  0.5122, Adjusted R-squared:  0.4905 
## F-statistic: 23.62 on 2 and 45 DF,  p-value: 9.676e-08
anova(hurricane_b_lm)
## Analysis of Variance Table
## 
## Response: log_stormindex
##                Df Sum Sq Mean Sq F value    Pr(>F)    
## WestAfrica_cat  1 4.9843  4.9843  31.518 1.163e-06 ***
## Temperature     1 2.4876  2.4876  15.730 0.0002591 ***
## Residuals      45 7.1163  0.1581                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
par(mfrow=c(2,2))
plot(hurricane_b_lm)

# PART C - Compare the SS residuals and residual df
a1 <- anova(hurricane_a_lm)
a2 <- anova(hurricane_b_lm)

#extracting the SS residuals 
a1["Residuals", "Sum Sq"]
## [1] 6.888113
a2["Residuals", "Sum Sq"]
## [1] 7.116333
#extracting the residual df
a1["Residuals", "Df"]
## [1] 44
a2["Residuals", "Df"]
## [1] 45

Part A: A log-transformation was conducted based problem statement in the question. A saturated model was fit to the log-transformed storm index data with West African wetness, temperature (categorical) and the interaction of West African wetness and temperature as explanatory variables. A reduced model was fit without the interaction term, and there was no evidence for an interaction effect (Extra Sum of Squares F-test, p-value = 0.56). The diagnostic plots show no problematic patterns in the model fit.


Part B: A log-transformation was conducted based problem statement in the question. A saturated model was fit to the log-transformed storm index data with West African wetness, temperature (continuous) and the interaction of West African wetness and temperature as explanatory variables. A reduced model was fit without the interaction term, and there was no evidence for an interaction effect (Extra Sum of Squares F-test, p-value = 0.60). The diagnostic plots show no problematic patterns in the model fit.


Part C: The sum of squared residuals and the residual degrees of freedom were slightly different in my analysis. The degrees of freedom are different for numeric and categorical variables. The degrees of freedom for a categorical variable are dependent on the number of levels in the variable, whereas thea continuous variable only has one degree of freedom.


Part D: The temperature term should be treated as categorical. It has three levels (1, 0, -1). If you were to square the temperature it would essentially get rid of the information captured in that term. It would not be appropriate to treat the temperature term as continuous in this study.




Supplemental Problems

13.17

Dinosaur Extinctions—An Observational Study. About 65 million years ago, the dinosaurs suffered a mass extinction virtually overnight (in geologic time).What happened in this period, the Cretaceous–Tertiary (KT) boundary, that could have produced such calamity? Among many clues, one that all scientists regard as crucial is a layer of iridium-rich dust that was deposited over much of the earth at that time. Data from one of the first discoveries of this layer are shown inDisplay 13.22. The diagram traces iridium concentrations in soil samples taken from extensive shale and limestone deposits at Gubbio, Italy. Iridium (parts per trillium) is graphed against the depth at which samples were taken, with depth giving a crude measure of historic time.

Iridium is a trace element present in most soils. But concentrations as high as those at the peak near 347 meters, at the KT boundary, can only occur in association with highly unusual events, such as volcanic eruptions or meteor impacts. So the theory is that such an event caused a massive dust cloud that blanketed the earth for years, killing off animals and their food sources. Butwas the cause a meteor (coming down on the Yucatan peninsula in central America) or volcanic eruptions (centered in southern China)? Two articles debating the issue appeared in the October 1990 issue of Scientific American—W. Alvarez and F. Asaro, “What Caused the Mass Extinction? An Extraterrestrial Impact,” Scientific American 263(4): 76–84, and E. Courtillot, “What Caused the Mass Extinction? A Volcanic Eruption,” Scientific American 263(4): 85–93. A crucial issue in the debate is the shape of the iridium trace because the timing and extent of the source give clues to its nature.

The raw data are provided in Display 13.23. (a) Fit the two-way, saturated model to the untransformed data and draw a plot of the residuals versus estimated means to see if a transformation is suggested. (b) Fit the two-way model (after transformation if appropriate) and test for interaction, using a multiple regression routine. (c) If appropriate with your statistical software, repeat part (b) using an analysis of variance routine.

#Read-In the data
dinos <- Sleuth3::ex1317

#take a look at the dataset 
head(dinos)
##   Iridium    Strata DepthCat
## 1      75 Limestone        1
## 2     200 Limestone        1
## 3     120 Limestone        2
## 4     310 Limestone        2
## 5     290 Limestone        3
## 6     450 Limestone        3
#transforming the data for later in the problem 
dinos <- dinos %>%
  mutate(log_iridium = log(Iridium))

#take a look at the boxplots 
p31 <- dinos %>%
  ggplot() + 
  geom_boxplot(aes(x = Strata, y = Iridium)) + 
  theme_classic()

p32 <- dinos %>%
  ggplot() + 
  geom_boxplot(aes(x = DepthCat, y = Iridium)) + 
  labs(x = "Depth Category", y = "") + 
  theme_classic()

grid.arrange(p31, p32, nrow = 1)

# PART A 
dinos_a_lm1 <- lm(Iridium ~ Strata + DepthCat + Strata:DepthCat, data = dinos)

par(mfrow=c(2,2))
plot(dinos_a_lm1)
## Warning: not plotting observations with leverage one:
##   18, 21

#refit without interaction and test for interaction effect
dinos_a_lm2 <- update(dinos_a_lm1, ~ . - Strata:DepthCat)
anova(dinos_a_lm2, dinos_a_lm1) # no evidence for an interaction effect, p-value = 0.43
## Analysis of Variance Table
## 
## Model 1: Iridium ~ Strata + DepthCat
## Model 2: Iridium ~ Strata + DepthCat + Strata:DepthCat
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1     21 268401                           
## 2     16 202878  5     65523 1.0335 0.4314
#run optimal model
dinos_a_lm <- lm(Iridium ~ Strata + DepthCat, data = dinos)

summary(dinos_a_lm)
## 
## Call:
## lm(formula = Iridium ~ Strata + DepthCat, data = dinos)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -236.713  -64.993   -2.625   61.504  201.642 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   143.64      60.88   2.359  0.02807 *  
## StrataShale   155.72      45.24   3.442  0.00244 ** 
## DepthCat2      52.79      86.67   0.609  0.54905    
## DepthCat3     383.07      75.97   5.042 5.43e-05 ***
## DepthCat4      76.18      80.74   0.944  0.35613    
## DepthCat5    -100.58      71.52  -1.406  0.17424    
## DepthCat6     -93.93      75.97  -1.236  0.22998    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 113.1 on 21 degrees of freedom
## Multiple R-squared:  0.7676, Adjusted R-squared:  0.7012 
## F-statistic: 11.56 on 6 and 21 DF,  p-value: 9.872e-06
anova(dinos_a_lm)
## Analysis of Variance Table
## 
## Response: Iridium
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## Strata     1  76407   76407  5.9781   0.02339 *  
## DepthCat   5 810293  162059 12.6796 9.337e-06 ***
## Residuals 21 268401   12781                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
par(mfrow=c(2,2))
plot(dinos_a_lm)

# PART B 
dinos_b_lm1 <- lm(log_iridium ~ Strata + DepthCat + Strata:DepthCat, data = dinos)

par(mfrow=c(2,2))
plot(dinos_b_lm1)
## Warning: not plotting observations with leverage one:
##   18, 21

#refit without interaction and test for interaction effect
dinos_b_lm2 <- update(dinos_b_lm1, ~ . - Strata:DepthCat)
anova(dinos_b_lm2, dinos_b_lm1) # no evidence for an interaction effect, p-value = 0.79
## Analysis of Variance Table
## 
## Model 1: log_iridium ~ Strata + DepthCat
## Model 2: log_iridium ~ Strata + DepthCat + Strata:DepthCat
##   Res.Df     RSS Df Sum of Sq      F Pr(>F)
## 1     21 10.3020                           
## 2     16  8.9601  5    1.3419 0.4792 0.7866
#run optimal model
dinos_b_lm <- lm(log_iridium ~ Strata + DepthCat, data = dinos)

summary(dinos_b_lm)
## 
## Call:
## lm(formula = log_iridium ~ Strata + DepthCat, data = dinos)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.33228 -0.29496  0.01974  0.43138  0.73288 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   4.7827     0.3772  12.680 2.62e-11 ***
## StrataShale   0.7010     0.2803   2.501   0.0207 *  
## DepthCat2     0.4092     0.5370   0.762   0.4546    
## DepthCat3     1.2465     0.4707   2.648   0.0150 *  
## DepthCat4     0.5448     0.5002   1.089   0.2884    
## DepthCat5    -0.2929     0.4431  -0.661   0.5157    
## DepthCat6    -0.8410     0.4707  -1.787   0.0884 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7004 on 21 degrees of freedom
## Multiple R-squared:   0.59,  Adjusted R-squared:  0.4729 
## F-statistic: 5.038 on 6 and 21 DF,  p-value: 0.002422
anova(dinos_b_lm)
## Analysis of Variance Table
## 
## Response: log_iridium
##           Df  Sum Sq Mean Sq F value   Pr(>F)   
## Strata     1  1.8727 1.87267  3.8173 0.064174 . 
## DepthCat   5 12.9550 2.59101  5.2816 0.002698 **
## Residuals 21 10.3020 0.49057                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
par(mfrow=c(2,2))
plot(dinos_b_lm)

# PART C 
dinos_c_aov1 <- aov(log_iridium ~ Strata + DepthCat + Strata:DepthCat, data = dinos)

par(mfrow=c(2,2))
plot(dinos_c_aov1)
## Warning: not plotting observations with leverage one:
##   18, 21

#refit without interaction and test for interaction effect
dinos_c_aov2 <- update(dinos_c_aov1, ~ . - Strata:DepthCat)
anova(dinos_c_aov2, dinos_c_aov1) # no evidence for an interaction effect, p-value = 0.79
## Analysis of Variance Table
## 
## Model 1: log_iridium ~ Strata + DepthCat
## Model 2: log_iridium ~ Strata + DepthCat + Strata:DepthCat
##   Res.Df     RSS Df Sum of Sq      F Pr(>F)
## 1     21 10.3020                           
## 2     16  8.9601  5    1.3419 0.4792 0.7866
#run optimal model
dinos_c_aov <- aov(log_iridium ~ Strata + DepthCat, data = dinos)

summary(dinos_c_aov)
##             Df Sum Sq Mean Sq F value Pr(>F)   
## Strata       1  1.873  1.8727   3.817 0.0642 . 
## DepthCat     5 12.955  2.5910   5.282 0.0027 **
## Residuals   21 10.302  0.4906                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
par(mfrow=c(2,2))
plot(dinos_c_aov)

Part A: A regression model was fit with the raw iridium data with stratum, depth category and the interaction of stratum and depth category as explanatory variables. A reduced model was fit without the interaction term, and there was no evidence for an interaction effect (Extra Sum of Squares F-test, p-value = 0.43). However, the residuals for this model were extraordinarly large, so a log-transformation was conducted for Part B of the problem.


Part B: A log-transformation was conducted based on the residual plot from Part A. A saturated model was fit to the log-transformed iridium data with stratum, depth category and the interaction of stratum and depth category as explanatory variables. A reduced model was fit without the interaction term, and there was no evidence for an interaction effect (Extra Sum of Squares F-test, p-value = 0.79). The diagnostic plots show no problematic patterns in the model fit.


Part C: A saturated model was fit to the log-transformed iridium data with stratum, depth category and the interaction of stratum and depth category as explanatory variables. A reduced model was fit without the interaction term, and there was no evidence for an interaction effect (Extra Sum of Squares F-test, p-value = 0.79). The diagnostic plots show no problematic patterns in the model fit. The exact same results as the regression routine in Part B.




13.21

Pygmalion. The term Pygmalion effect (used in the Section 13.1.2 data problem) originated with the 1960s’ work of psychologist Robert Rosenthal and elementary school principal Lenore Jacobson, who conducted a randomized block experiment on the children in Jacobson’s elementary school. In each classroom (i.e., for each block of students), they assigned roughly 20% of the students to a “likely to excel” treatment group and the remainder to a control group. After all students took a standardized test at the begining of the school year, Rosenthal and Jacobson identified to teachers those students that had been indicated by the intelligence test as very likely to excel in the coming year. This was false information, though. The “likely to excel” students were simply those who had been assigned by a chance mechanism to the “likely to excel” treatment group. The researchers deceived the teachers with this false information to create artificial expectations and to explore the effect of those expectations on student test score gains. Display 13.26 is a partial listing of a data set simulated to match the summary statistics and conclusions from Table A-7 of Rosenthal and Jacobson’s report (R. Rosenthal, and L. Jacobson, 1968, Pygmalion in the Classroom: Teacher Expectation and Pupil’s Intellectual Development, Holt, Rinehart, andWinston, Inc.). Analyze the data to summarize the evidence that teachers’ expectations affect student test score gains. Write a statistical report of your conclusions.

#Read-In the data
pygmalion <- Sleuth3::ex1321

#take a look at the dataset 
head(pygmalion)
##   Student Grade Class Treatment Gain
## 1       1     1    1a   control   -4
## 2       2     1    1a   control    0
## 3       3     1    1a   control  -19
## 4       4     1    1a   control   24
## 5       5     1    1a   control   19
## 6       6     1    1a   control   10
#take a look at the boxplots 
pygmalion %>%
  ggplot() + 
  geom_boxplot(aes(x = Treatment, y = Gain)) + 
  labs(title = "Boxplot of Student Test Score Gains vs Treatment") +
  theme_classic()

pygmalion %>%
  ggplot() + 
  geom_boxplot(aes(x = Class, y = Gain, color = Treatment)) + 
  labs(title = "Boxplot of Student Test Score Gains vs Classroom") + 
  theme_classic()

#take a look at interaction plot 
with(pygmalion, interaction.plot(Class, Treatment, Gain))

# REGRESSION ANALYSIS
pygmalion_lm1 <- lm(Gain ~ Treatment + Class + Treatment:Class, data = pygmalion)

summary(pygmalion_lm1)
## 
## Call:
## lm(formula = Gain ~ Treatment + Class + Treatment:Class, data = pygmalion)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -46.467  -7.572   0.419   7.150  39.722 
## 
## Coefficients:
##                            Estimate Std. Error t value Pr(>|t|)   
## (Intercept)                  6.7647     2.9507   2.293  0.02260 * 
## Treatmentpygmalion          11.2353    12.5188   0.897  0.37022   
## Class1b                      6.7020     4.3098   1.555  0.12104   
## Class1c                      9.5478     4.2376   2.253  0.02501 * 
## Class2a                     -2.6068     4.0616  -0.642  0.52151   
## Class2b                      3.0924     4.3908   0.704  0.48182   
## Class2c                      1.0924     4.3908   0.249  0.80369   
## Class3a                      7.5686     4.5871   1.650  0.10004   
## Class3b                     -6.8314     4.3098  -1.585  0.11405   
## Class3c                     -4.3032     4.4825  -0.960  0.33787   
## Class4a                      0.5131     4.1146   0.125  0.90085   
## Class4b                     -7.2022     4.2376  -1.700  0.09030 . 
## Class4c                     -7.6980     4.3098  -1.786  0.07513 . 
## Class5a                     12.1728     4.2376   2.873  0.00438 **
## Class5b                      8.2353     4.8485   1.699  0.09050 . 
## Class6a                      7.8353     4.0134   1.952  0.05188 . 
## Class6b                      1.5430     4.4825   0.344  0.73093   
## Class6c                      0.1520     4.5871   0.033  0.97360   
## Treatmentpygmalion:Class1b  -1.7020    14.2686  -0.119  0.90514   
## Treatmentpygmalion:Class1c  13.4522    15.4913   0.868  0.38592   
## Treatmentpygmalion:Class2a   7.1068    13.7543   0.517  0.60577   
## Treatmentpygmalion:Class2b -14.0924    14.7184  -0.957  0.33914   
## Treatmentpygmalion:Class2c  -5.0924    14.7184  -0.346  0.72960   
## Treatmentpygmalion:Class3a -15.5686    13.6952  -1.137  0.25657   
## Treatmentpygmalion:Class3b  11.8314    17.7371   0.667  0.50528   
## Treatmentpygmalion:Class3c -17.4968    14.0609  -1.244  0.21439   
## Treatmentpygmalion:Class4a -11.3131    13.9480  -0.811  0.41799   
## Treatmentpygmalion:Class4b -11.4645    14.6735  -0.781  0.43527   
## Treatmentpygmalion:Class4c  -2.3020    14.2686  -0.161  0.87195   
## Treatmentpygmalion:Class5a -11.7728    13.9848  -0.842  0.40059   
## Treatmentpygmalion:Class5b  -9.9853    14.4404  -0.691  0.48982   
## Treatmentpygmalion:Class6a -12.5853    14.1819  -0.887  0.37560   
## Treatmentpygmalion:Class6b -10.2930    14.3217  -0.719  0.47291   
## Treatmentpygmalion:Class6c -12.1520    14.7782  -0.822  0.41160   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12.17 on 286 degrees of freedom
## Multiple R-squared:  0.269,  Adjusted R-squared:  0.1846 
## F-statistic: 3.189 on 33 and 286 DF,  p-value: 8.209e-08
par(mfrow=c(2,2))
plot(pygmalion_lm1)
## Warning: not plotting observations with leverage one:
##   18, 150

#refit without interaction and test for interaction effect
pygmalion_lm2 <- update(pygmalion_lm1, ~ . - Treatment:Class)
anova(pygmalion_lm2, pygmalion_lm1) # no evidence for an interaction effect, p-value = 0.11
## Analysis of Variance Table
## 
## Model 1: Gain ~ Treatment + Class
## Model 2: Gain ~ Treatment + Class + Treatment:Class
##   Res.Df   RSS Df Sum of Sq     F Pr(>F)
## 1    302 45820                          
## 2    286 42332 16    3488.4 1.473 0.1086
#refit without class and test for class effect 
pygmalion_lm3 <- update(pygmalion_lm2, ~ . - Class)
anova(pygmalion_lm3, pygmalion_lm2) # strong evidence for a class effect, p-value < 0.0001
## Analysis of Variance Table
## 
## Model 1: Gain ~ Treatment
## Model 2: Gain ~ Treatment + Class
##   Res.Df   RSS Df Sum of Sq     F    Pr(>F)    
## 1    318 57106                                 
## 2    302 45820 16     11286 4.649 2.228e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#run optimal model
pygmalion_lm <- lm(Gain ~ Treatment + Class, data = pygmalion)

summary(pygmalion_lm)
## 
## Call:
## lm(formula = Gain ~ Treatment + Class, data = pygmalion)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -47.714  -7.310  -0.303   6.885  42.698 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)   
## (Intercept)          7.1885     2.9049   2.475  0.01389 * 
## Treatmentpygmalion   3.6076     1.7457   2.067  0.03963 * 
## Class1b              7.5257     4.0605   1.853  0.06480 . 
## Class1c             11.4662     4.1070   2.792  0.00557 **
## Class2a              0.5057     3.8212   0.132  0.89480   
## Class2b              1.5278     4.1712   0.366  0.71441   
## Class2c              1.1161     4.1712   0.268  0.78921   
## Class3a              3.9685     4.0468   0.981  0.32756   
## Class3b             -6.0389     4.2323  -1.427  0.15465   
## Class3c             -7.4684     4.1242  -1.811  0.07115 . 
## Class4a             -0.7119     3.8866  -0.183  0.85480   
## Class4b             -8.2318     4.0554  -2.030  0.04325 * 
## Class4c             -7.0006     4.0605  -1.724  0.08572 . 
## Class5a             10.7621     3.9693   2.711  0.00709 **
## Class5b              7.1379     4.4077   1.619  0.10640   
## Class6a              6.5853     3.8456   1.712  0.08785 . 
## Class6b              0.4921     4.1776   0.118  0.90631   
## Class6c             -1.1767     4.3136  -0.273  0.78521   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12.32 on 302 degrees of freedom
## Multiple R-squared:  0.2088, Adjusted R-squared:  0.1642 
## F-statistic: 4.687 on 17 and 302 DF,  p-value: 8.209e-09
anova(pygmalion_lm)
## Analysis of Variance Table
## 
## Response: Gain
##            Df Sum Sq Mean Sq F value    Pr(>F)    
## Treatment   1    803  803.08   5.293   0.02209 *  
## Class      16  11286  705.37   4.649 2.228e-08 ***
## Residuals 302  45820  151.72                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
par(mfrow=c(2,2))
plot(pygmalion_lm)

confint(pygmalion_lm)
##                          2.5 %     97.5 %
## (Intercept)          1.4720406 12.9048877
## Treatmentpygmalion   0.1723438  7.0429480
## Class1b             -0.4647547 15.5161861
## Class1c              3.3842513 19.5482325
## Class2a             -7.0138819  8.0252837
## Class2b             -6.6803788  9.7360462
## Class2c             -7.0921435  9.3242815
## Class3a             -3.9950751 11.9320302
## Class3b            -14.3673784  2.2894944
## Class3c            -15.5840869  0.6473554
## Class4a             -8.3600906  6.9363598
## Class4b            -16.2122400 -0.2513132
## Class4c            -14.9910705  0.9898703
## Class5a              2.9510551 18.5731377
## Class5b             -1.5357916 15.8116371
## Class6a             -0.9822727 14.1527959
## Class6b             -7.7288265  8.7130061
## Class6c             -9.6652752  7.3119553
#randomization distribution 
obs = summary(lm(Gain ~ Treatment + Class, data = pygmalion))$coefficients["Treatmentpygmalion", "t value"]

nulldist = do(15000) * summary(lm(Gain ~ shuffle(Treatment) + shuffle(Class), data = pygmalion))$coefficients["shuffle(Treatment)pygmalion", "t value"]

histogram(~result, groups = result >= obs, v = obs, data = nulldist)

tally(~result >= obs, format = "proportion", data = nulldist)
## result >= obs
##       TRUE      FALSE 
## 0.01986667 0.98013333

The Pygmalion treatments adds an estimated 3.61 points to a students test score (95% confidence interval: 0.147 to 12.90 points). The study provides moderate evidence that the effect is real (one-sided p-value = 0.021).